MSBR 70310 Final Project

Author

Madison Bradley

import pandas as pd
import numpy as np
import geopandas as gpd
import pandas as pd
from shapely.geometry import Polygon
import requests
import time
import glob as glob
import pymcdm.methods as mcdm

Introduction

Zipline is an American technology company that designs, builds, and operates autonomous drones. In 2016, Zipline launched a first-of-its-kind partnership with the Rwandan government to use the drones to deliver blood to rural health facilities. The result was a 51% decrease in mortality from post-partum hemorrhage 1.

Since then, Zipline has expanded to Ghana, Nigeria, Kenya, and Cote d’Ivoire delivering blood, vaccines, and other essential medicines that are difficult to stock in rural areas. This project explores the hypothetical expansion of Zipline into Zambia.

Zambia’s development status is similar to that of Zipline’s existing partners. The country ranks 153rd out of 193 countries according to the Human Development Index 2. Eastern Province is Zambia’s third most populous province and borders relatively stable neighboring countries Malawi and Mozambique 3. Road infrastructure is poor, and in the rainy season, many roads in rural areas become impassable 4.

The following exercise attempts to answer the question: If Zipline were to build a hub in Eastern Province, Zambia, where precisely should the hub be located to maximize impact? We measure impact based on number of facilities and population served and improvement over road transportation. Operational feasibility is considered based on cellular network coverage, which the drones require for communication with the hub.

Set Up and Visualization

Part 1: Zambia Administrative Boundaries

Source: GADM (https://gadm.org/download_country.html)

We begin by downloading and visualizing the administrative boundaries of Zambia in ArcGIS. We then subset these shapefiles to Eastern Province for further analysis.

Administrative Borders of Zambia

Eastern Province, Zambia

Eastern Province, Zambia (zoomed)

Part 2: Zambia Roads

Source: OpenStreetMap (https://data.humdata.org/dataset/hotosm_zmb_roads?force_layout=desktop)

To visualize urbanicity and connectivity, we download and visualize all roadways in Zambia in ArcGIS. Again, we then subset these shapefiles to Eastern Province.

Roads in Eastern Province

Part 3: Eastern Province Health Facilities

Source: Zambia Ministry of Health Master Facility List (https://mfl.moh.gov.zm/)

Health facilities in Zambia’s Eastern Province are restocked by ZAMMSA, the Zambia Medicines and Medical Supplies Agency, from a regional hub in Chipata 5. Deliveries of standard products to multiple faciltiies are optimized along a single route using the Dynamic Routing Tool developed by Chemonics International on behalf of USAID 6.

Unlike standard products, specialty products, such as blood products and temperature-sensitive medications, are difficult for rural facilities to keep stocked. They likely manage their stock levels by ordering from the nearest hospital i. Zipline drones are used to deliver these products to rural health facilities, saving travel time and ensuring product availability.

We download the Master Facility List for Eastern Province from the Zambia Ministry of Health and visualize these facilities in ArcGIS. We then calculate the distance and travel time by road from each facility to its official resupply hub in Chipata. Taken as a direct measurement, this value represents the travel time saved over direct delivery from the hub, which is unlikely for standard or even specialty products. Rather, we suggest that this value be understood as the maximum travel time saved by using drones and a proxy metric for isolation. Zipline drones are better suited to resupply rural facilities, as the travel time saved by drones in urban areas is minimal.

i Assumed as is standard practice. Actual resupply routes are not publically available.

Step 1: Load the Master Facility List (MFL) data.

# Load
MFL = pd.read_excel('C:/Users/mm-br/OneDrive - nd.edu/Documents/Mod 3/3 - Unstructured Data Analysis/Final Project/mfl_facilities_export20250217013607.xlsx')

# View
print(MFL.head())
print(len(MFL))

# Prepare latitude and longitude columns for spatial operations, remove spaces and convert to numeric
MFL['Latitude'] = pd.to_numeric(MFL['Latitude'].astype(str).str.replace(' ', '', regex=True), errors='coerce')
MFL['Longitude'] = pd.to_numeric(MFL['Longitude'].astype(str).str.replace(' ', '', regex=True), errors='coerce')
   MFL Code    DHIS2 UID Hims code                           Name Province  \
0      2564  giU2oIx29d9  30040029          Mankhaka  Health Post  Eastern   
1      1901          NaN  30030016       Kawaza Rural Health Post  Eastern   
2      2004  sBDT0IlHEjL  30010001  Bwanunkha Rural Health Centre  Eastern   
3      1823  wfrDFLlBldb  30060004   Chinambi Rural Health Centre  Eastern   
4      2103  YyEYUwT9aPn    302527      Mnoro Rural Health Centre  Eastern   

     District Constituency      Ward Zone Location  ... Ownership  \
0     Lundazi      Lundazi   Lunevwa  NaN    Rural  ...       MOH   
1      Katete      Milanzi   Milanzi  NaN    Rural  ...       MOH   
2     Chadiza      Chadiza     Manje  NaN    Rural  ...       MOH   
3      Nyimba       Nyimba  Chinambi  NaN    Rural  ...       MOH   
4  Chipangali   Chipangali  Msandile  NaN    Rural  ...       MOH   

  Ownership type Operational status Mobility status Accesibility  \
0         Public         Functional           Fixed         Open   
1         Public         Functional           Fixed         Open   
2         Public         Functional           Fixed         Open   
3         Public         Functional           Fixed         Open   
4         Public         Functional           Fixed         Open   

  Catchment population head count  Catchment population cso  \
0                             NaN                    8153.0   
1                             NaN                    3524.0   
2                          7718.0                    5156.0   
3                          6121.0                    4924.0   
4                         10984.0                    9672.0   

   Number of households   Latitude  Longitude  
0                   NaN -12.288883  33.174417  
1                   NaN -14.256272  32.164201  
2                   5.0 -13.934065  32.576443  
3                   NaN  -14.46886  30.670398  
4                   NaN -13.525668  32.646855  

[5 rows x 21 columns]
404

Step 2: Subset the MFL data to only include facilities in Eastern Province.

Visualization of the raw MFL export in ArcGIS revealed that, despite having downloaded data specifically for Eastern Province, many facility coordinates fell outside the province’s boundaries. To limit our data to the province of interest and avoid errors in later clustering operations, we use the Eastern Province shapefile to filter the MFL data.

# Load Eastern Province shapefile as GeoDataFrame
EasternProvince_gdf = gpd.read_file('C:/Users/mm-br/OneDrive - nd.edu/Documents/Mod 3/3 - Unstructured Data Analysis/Final Project/ZAM - IS - Eastern Province.shp')

# Convert MFL to GeoDataFrame to allow spatial operations
# Convert latitude and longitude columns to point geometry
MFL_gdf = gpd.GeoDataFrame(MFL, geometry=gpd.points_from_xy(MFL.Longitude, MFL.Latitude))

# Ensure both GeoDataFrames have the same CRS (Coordinate Reference System)
MFL_gdf = MFL_gdf.set_crs(EasternProvince_gdf.crs, allow_override=True)

# Filter facilities based on Eastern Province boundaries
geosubset_MFL_gdf = MFL_gdf[MFL_gdf.geometry.within(EasternProvince_gdf.unary_union)]

# Convert back to DataFrame, remove geometry
MFL_V1 = pd.DataFrame(geosubset_MFL_gdf.drop(columns='geometry'))

# View
print(MFL_V1.head())
print(len(MFL_V1))

# Save
MFL_V1.to_csv('C:/Users/mm-br/OneDrive - nd.edu/Documents/Mod 3/3 - Unstructured Data Analysis/Final Project/MFL_V1.csv', index = False)
C:\Users\mm-br\AppData\Local\Temp\ipykernel_40356\614315478.py:12: DeprecationWarning:

The 'unary_union' attribute is deprecated, use the 'union_all()' method instead.
   MFL Code    DHIS2 UID Hims code                           Name Province  \
0      2564  giU2oIx29d9  30040029          Mankhaka  Health Post  Eastern   
1      1901          NaN  30030016       Kawaza Rural Health Post  Eastern   
2      2004  sBDT0IlHEjL  30010001  Bwanunkha Rural Health Centre  Eastern   
3      1823  wfrDFLlBldb  30060004   Chinambi Rural Health Centre  Eastern   
4      2103  YyEYUwT9aPn    302527      Mnoro Rural Health Centre  Eastern   

     District Constituency      Ward Zone Location  ... Ownership  \
0     Lundazi      Lundazi   Lunevwa  NaN    Rural  ...       MOH   
1      Katete      Milanzi   Milanzi  NaN    Rural  ...       MOH   
2     Chadiza      Chadiza     Manje  NaN    Rural  ...       MOH   
3      Nyimba       Nyimba  Chinambi  NaN    Rural  ...       MOH   
4  Chipangali   Chipangali  Msandile  NaN    Rural  ...       MOH   

  Ownership type Operational status Mobility status Accesibility  \
0         Public         Functional           Fixed         Open   
1         Public         Functional           Fixed         Open   
2         Public         Functional           Fixed         Open   
3         Public         Functional           Fixed         Open   
4         Public         Functional           Fixed         Open   

  Catchment population head count  Catchment population cso  \
0                             NaN                    8153.0   
1                             NaN                    3524.0   
2                          7718.0                    5156.0   
3                          6121.0                    4924.0   
4                         10984.0                    9672.0   

   Number of households   Latitude  Longitude  
0                   NaN -12.288883  33.174417  
1                   NaN -14.256272  32.164201  
2                   5.0 -13.934065  32.576443  
3                   NaN -14.468860  30.670398  
4                   NaN -13.525668  32.646855  

[5 rows x 21 columns]
343

Step 3: Calculate resupply distance and travel time by road.

Note: Exact coordinates of ZAMMSA Chipata hub are not available, so we use the coordinates of Chipata Central Hospital as a proxy.

# Load MFL_V1
MFL_V1 = pd.read_csv('C:/Users/mm-br/OneDrive - nd.edu/Documents/Mod 3/3 - Unstructured Data Analysis/Final Project/MFL_V1.csv')
# Use OpenRouteService Reverse Geocoding API to return nearest routable coordinates for each facility
geocodes = []

for i in range(len(MFL_V1)):
  if (i + 1) % 100 == 0:
    time.sleep(120)
  api_key = '5b3ce3597851110001cf6248df4e0aade92844a8bf157de2e91818f5'
  point_lat = str(MFL_V1.iloc[i]['Latitude'])
  point_lon = str(MFL_V1.iloc[i]['Longitude'])
  link = (
    'https://api.openrouteservice.org/geocode/reverse?api_key={}&point.lon={}'
    '&point.lat={}&size=1&boundary.country=ZMB'.format(api_key, point_lon, point_lat))
  geocode_request = requests.get(link)
  geocodes.append(geocode_request.json()['features'][0]['geometry']['coordinates'])
# Find Chipata General Hospital georeference
api_key = '5b3ce3597851110001cf6248df4e0aade92844a8bf157de2e91818f5'
point_lat = '-13.641615049376906'
point_lon = '32.63751062298343'
link = (
  'https://api.openrouteservice.org/geocode/reverse?api_key={}&point.lon={}'
  '&point.lat={}&size=1&boundary.country=ZMB'.format(api_key, point_lon, point_lat))
geocode_request = requests.get(link)
ChipataGeneralHospital = (geocode_request.json()['features'][0]['geometry']['coordinates'])
ChipataGeneralHospital_lon = ChipataGeneralHospital[0]
ChipataGeneralHospital_lat = ChipataGeneralHospital[1]
ChipataGeneralHospital = str(ChipataGeneralHospital_lon) + ',' + str(ChipataGeneralHospital_lat)
# Convert to DataFrame
geocodes = pd.DataFrame(geocodes, columns=['Longitude', 'Latitude'])

# Save
geocodes.to_csv('C:/Users/mm-br/OneDrive - nd.edu/Documents/Mod 3/3 - Unstructured Data Analysis/Final Project/geocodes.csv')
# Load geocodes
geocodes = pd.read_csv('C:/Users/mm-br/OneDrive - nd.edu/Documents/Mod 3/3 - Unstructured Data Analysis/Final Project/geocodes.csv')

# View
print(geocodes.head())
print(len(geocodes))

# Add to MFL_V1
MFL_V1['Geocoded_Lon'] = geocodes['Longitude']
MFL_V1['Geocoded_Lat'] = geocodes['Latitude']

# Save
MFL_V1.to_csv('C:/Users/mm-br/OneDrive - nd.edu/Documents/Mod 3/3 - Unstructured Data Analysis/Final Project/MFL_V2.csv', index = False)
   Unnamed: 0  Longitude   Latitude
0           0  33.177151 -12.288183
1           1  31.946058 -14.093715
2           2  32.575007 -13.932266
3           3  30.581171 -14.379737
4           4  32.646664 -13.525717
343
# Load MFL_V2
MFL_V2 = pd.read_csv('C:/Users/mm-br/OneDrive - nd.edu/Documents/Mod 3/3 - Unstructured Data Analysis/Final Project/MFL_V2.csv')

# View
print(MFL_V2.head())
   MFL Code    DHIS2 UID Hims code                           Name Province  \
0      2564  giU2oIx29d9  30040029          Mankhaka  Health Post  Eastern   
1      1901          NaN  30030016       Kawaza Rural Health Post  Eastern   
2      2004  sBDT0IlHEjL  30010001  Bwanunkha Rural Health Centre  Eastern   
3      1823  wfrDFLlBldb  30060004   Chinambi Rural Health Centre  Eastern   
4      2103  YyEYUwT9aPn    302527      Mnoro Rural Health Centre  Eastern   

     District Constituency      Ward Zone Location  ... Operational status  \
0     Lundazi      Lundazi   Lunevwa  NaN    Rural  ...         Functional   
1      Katete      Milanzi   Milanzi  NaN    Rural  ...         Functional   
2     Chadiza      Chadiza     Manje  NaN    Rural  ...         Functional   
3      Nyimba       Nyimba  Chinambi  NaN    Rural  ...         Functional   
4  Chipangali   Chipangali  Msandile  NaN    Rural  ...         Functional   

  Mobility status Accesibility Catchment population head count  \
0           Fixed         Open                             NaN   
1           Fixed         Open                             NaN   
2           Fixed         Open                          7718.0   
3           Fixed         Open                          6121.0   
4           Fixed         Open                         10984.0   

  Catchment population cso Number of households   Latitude  Longitude  \
0                   8153.0                  NaN -12.288883  33.174417   
1                   3524.0                  NaN -14.256272  32.164201   
2                   5156.0                  5.0 -13.934065  32.576443   
3                   4924.0                  NaN -14.468860  30.670398   
4                   9672.0                  NaN -13.525668  32.646855   

   Geocoded_Lon  Geocoded_Lat  
0     33.177151    -12.288183  
1     31.946058    -14.093715  
2     32.575007    -13.932266  
3     30.581171    -14.379737  
4     32.646664    -13.525717  

[5 rows x 23 columns]
# Use OpenRouteService Directions API to calculate distance/time from Chipata General Hospital to each facility
distance = []
duration = []

for i in range(len(MFL_V2)):
  if (i + 1) % 40 == 0:
    time.sleep(80)
  api_key = '5b3ce3597851110001cf6248df4e0aade92844a8bf157de2e91818f5'
  start = ChipataGeneralHospital
  end = (
    str(MFL_V2.iloc[i]['Geocoded_Lon']) + 
    ',' + 
    str(MFL_V2.iloc[i]['Geocoded_Lat']))
  link = (
    'https://api.openrouteservice.org/v2/directions/driving-car?'
    'api_key={}&start={}&end={}'.format(api_key, start, end))
  route_request = requests.get(link)
  if 'features' in route_request.json():
    distance.append(route_request.json()['features'][0]['properties']['segments'][0]['distance'])
    duration.append(route_request.json()['features'][0]['properties']['segments'][0]['duration'])
  else:
    distance.append('')
    duration.append('')
# Add distance and duration to MFL_V2
distance = pd.to_numeric(distance)
duration = pd.to_numeric(duration)
MFL_V2['Distance_KM'] = distance / 1000
MFL_V2['Duration_HR'] = duration / 3600

# Save
MFL_V2.to_csv('C:/Users/mm-br/OneDrive - nd.edu/Documents/Mod 3/3 - Unstructured Data Analysis/Final Project/MFL_V3.csv', index = False)

Step 4: Clean and finalize facility list.

# Load MFL_V3
MFL_V3 = pd.read_csv('C:/Users/mm-br/OneDrive - nd.edu/Documents/Mod 3/3 - Unstructured Data Analysis/Final Project/MFL_V3.csv')

# View
print(MFL_V3.head())
   MFL Code    DHIS2 UID Hims code                           Name Province  \
0      2564  giU2oIx29d9  30040029          Mankhaka  Health Post  Eastern   
1      1901          NaN  30030016       Kawaza Rural Health Post  Eastern   
2      2004  sBDT0IlHEjL  30010001  Bwanunkha Rural Health Centre  Eastern   
3      1823  wfrDFLlBldb  30060004   Chinambi Rural Health Centre  Eastern   
4      2103  YyEYUwT9aPn    302527      Mnoro Rural Health Centre  Eastern   

     District Constituency      Ward Zone Location  ... Accesibility  \
0     Lundazi      Lundazi   Lunevwa  NaN    Rural  ...         Open   
1      Katete      Milanzi   Milanzi  NaN    Rural  ...         Open   
2     Chadiza      Chadiza     Manje  NaN    Rural  ...         Open   
3      Nyimba       Nyimba  Chinambi  NaN    Rural  ...         Open   
4  Chipangali   Chipangali  Msandile  NaN    Rural  ...         Open   

  Catchment population head count Catchment population cso  \
0                             NaN                   8153.0   
1                             NaN                   3524.0   
2                          7718.0                   5156.0   
3                          6121.0                   4924.0   
4                         10984.0                   9672.0   

  Number of households   Latitude  Longitude  Geocoded_Lon  Geocoded_Lat  \
0                  NaN -12.288883  33.174417     33.177151    -12.288183   
1                  NaN -14.256272  32.164201     31.946058    -14.093715   
2                  5.0 -13.934065  32.576443     32.575007    -13.932266   
3                  NaN -14.468860  30.670398     30.581171    -14.379737   
4                  NaN -13.525668  32.646855     32.646664    -13.525717   

   Distance_KM  Duration_HR  
0     183.4942     3.030917  
1      96.7518     1.184139  
2      38.7530     1.233583  
3          NaN          NaN  
4      17.7136     0.319778  

[5 rows x 25 columns]
# Remove unnecessary columns
MFL_V3 = MFL_V3.drop(columns=['MFL Code', 'DHIS2 UID', 'Hims code', 'Zone', 'Constituency', 'Ward', 'Location', 'Mobility status', 'Accesibility', 'Operational status', 'Catchment population head count', 'Number of households'])

# Note sparsity
routing_sparsity = (MFL_V3[MFL_V3['Distance_KM'] == ''].sum()) / len(MFL_V3)

# Save
MFL_V3.to_csv('C:/Users/mm-br/OneDrive - nd.edu/Documents/Mod 3/3 - Unstructured Data Analysis/Final Project/MFL_VF.csv', index = False)
# Load MFL_VF
MFL_VF = pd.read_csv('C:/Users/mm-br/OneDrive - nd.edu/Documents/Mod 3/3 - Unstructured Data Analysis/Final Project/MFL_VF.csv')

# View
print(MFL_VF.head())
                            Name Province    District           Type  \
0          Mankhaka  Health Post  Eastern     Lundazi    Health Post   
1       Kawaza Rural Health Post  Eastern      Katete    Health Post   
2  Bwanunkha Rural Health Centre  Eastern     Chadiza  Health Centre   
3   Chinambi Rural Health Centre  Eastern      Nyimba  Health Centre   
4      Mnoro Rural Health Centre  Eastern  Chipangali  Health Centre   

  Ownership Ownership type  Catchment population cso   Latitude  Longitude  \
0       MOH         Public                    8153.0 -12.288883  33.174417   
1       MOH         Public                    3524.0 -14.256272  32.164201   
2       MOH         Public                    5156.0 -13.934065  32.576443   
3       MOH         Public                    4924.0 -14.468860  30.670398   
4       MOH         Public                    9672.0 -13.525668  32.646855   

   Geocoded_Lon  Geocoded_Lat  Distance_KM  Duration_HR  
0     33.177151    -12.288183     183.4942     3.030917  
1     31.946058    -14.093715      96.7518     1.184139  
2     32.575007    -13.932266      38.7530     1.233583  
3     30.581171    -14.379737          NaN          NaN  
4     32.646664    -13.525717      17.7136     0.319778  

Health Facilities in Eastern Province

Part 4: Cellular Network Coverage

Source: OpenCellID (https://www.opencellid.org/)

Zipline drones require cellular network coverage to operate 7. To estimate network coverage in Eastern Province, we pull cell tower data from OpenCellID’s API. The data includes tower latitude and longitude, radio (2G, 3G, 4G), and signal range, among other variables. For this exercise, we assume that all towers are operational and that 2G coverage is sufficient for Zipline drones.

Step 1: Subset Eastern Province into 0.015 x 0.015 degree grid cells.

OpenCellID’s API can only process requests for small areas. Based on the known maximum and minimum latitude and longitude values for Eastern Province (from ArcGIS), we create a grid of 0.015 x 0.015 degree cells and store the boundaries in a DataFrame.

# Define lat/lon range and step size
lat_range = np.round(np.arange(-15.0, -11.595, 0.015), 3)
lon_range = np.round(np.arange(30.0, 33.585, 0.015), 3)

# Create list to store grid cell boundaries
cells = []

# Loop through the latitude and longitude ranges to pull grid cell boundaries
for i in range(len(lat_range) - 1):
    for j in range(len(lon_range) - 1):
        latmin = lat_range[i]
        latmax = lat_range[i + 1]
        lonmin = lon_range[j]
        lonmax = lon_range[j + 1]
        cells.append([latmin, latmax, lonmin, lonmax])

# Create DataFrame from list
cells = pd.DataFrame(cells, columns=["latmin", "latmax", "lonmin", "lonmax"])

# View DataFrame
print(cells.head())
print(len(cells))
   latmin  latmax  lonmin  lonmax
0   -15.0 -14.985  30.000  30.015
1   -15.0 -14.985  30.015  30.030
2   -15.0 -14.985  30.030  30.045
3   -15.0 -14.985  30.045  30.060
4   -15.0 -14.985  30.060  30.075
54014

Step 2: Reduce the grid cells to only those that are within the Eastern Province boundary shapefile.

Because Eastern Province is not a perfect square, many of the grid cells generated in Step 1 are outside of the province’s boundaries. To reduce the number of calls to the OpenCellID API, we will remove the grid cells that are not within the Eastern Province shapefile.

# Load Eastern Province shapefile as GeoDataFrame
EasternProvince_gdf = gpd.read_file('C:/Users/mm-br/OneDrive - nd.edu/Documents/Mod 3/3 - Unstructured Data Analysis/Final Project/ZAM - IS - Eastern Province.shp')

# Write function to convert cell boundaries to polygons by using lat/lon min/max to define cell corners
def create_polygon(row):
    return Polygon([(row['lonmin'], row['latmin']),
                    (row['lonmin'], row['latmax']),
                    (row['lonmax'], row['latmax']),
                    (row['lonmax'], row['latmin'])])

# Apply the function to create polygons for each grid cell and add them to a new column
cells['geometry'] = cells.apply(create_polygon, axis=1)

# Convert the cells DataFrame into a GeoDataFrame to allow spatial operations
cells_gdf = gpd.GeoDataFrame(cells, geometry='geometry')

# Ensure both GeoDataFrames have the same CRS (Coordinate Reference System)
cells_gdf = cells_gdf.set_crs(EasternProvince_gdf.crs, allow_override=True)

# Filter cells based on Eastern Province boundaries
geosubset_cells_gdf = cells_gdf[cells_gdf.geometry.within(EasternProvince_gdf.unary_union)]

# Convert back to DataFrame, remove geometry
cells = pd.DataFrame(geosubset_cells_gdf.drop(columns='geometry'))

# View DataFrame
print(cells.head())
print(len(cells))
C:\Users\mm-br\AppData\Local\Temp\ipykernel_40356\2078088909.py:21: DeprecationWarning:

The 'unary_union' attribute is deprecated, use the 'union_all()' method instead.
     latmin  latmax  lonmin  lonmax
254 -14.985 -14.970  30.225  30.240
255 -14.985 -14.970  30.240  30.255
493 -14.970 -14.955  30.225  30.240
494 -14.970 -14.955  30.240  30.255
495 -14.970 -14.955  30.255  30.270
18326

Step 3: Pull OpenCellID data for each grid cell.

Having reduced the number of grid cells to a (slightly) more manageable number, we can now loop through them to pull data through the OpenCellID API. The free version of the API has a limit of 5000 calls per day.

# Subset for API batching
subset1 = cells.iloc[0:5000, :]
subset2 = cells.iloc[5000:10000, :]
subset3 = cells.iloc[10000:15000, :]
subset4 = cells.iloc[15000:len(cells), :]
# Subset 1: Loop through cells to pull OpenCellID data
for i in range(len(subset1)):
  if (i + 1) % 1000 == 0:
    time.sleep(120)
  api_key = 'pk.763aaba1a5d1ced55fac8eb302d7828b'
  mcc = 645
  output_format = 'csv'
  latmin = subset1.iloc[i]['latmin']
  latmax = subset1.iloc[i]['latmax']
  lonmin = subset1.iloc[i]['lonmin']
  lonmax = subset1.iloc[i]['lonmax']
  link = (
   'https://www.opencellid.org/cell/getInArea?key={}&BBOX={},{},{},{}&'
   'mcc={}&format={}'.format(api_key, latmin, lonmin, latmax, lonmax, mcc, output_format))
  oci_request = requests.get(link)
  file_name = 'subset1.csv'
  with open(file_name, 'ab') as file:
    headers = oci_request.content.decode().splitlines()[0] # Extract the first line as headers
    file.write(f"{headers}\n".encode()) # Write headers to file
    data = oci_request.content.decode().splitlines()[1:] # Extract all lines after the headers
    for i in data: # Loop through each line of data
      file.write(f"{i}\n".encode()) # Write each line to the file, avoiding overwriting earlier lines
# Subset 2: Loop through cells to pull OpenCellID data
for i in range(len(subset2)):
  if (i + 1) % 1000 == 0:
    time.sleep(120)
  api_key = 'pk.763aaba1a5d1ced55fac8eb302d7828b'
  mcc = 645
  output_format = 'csv'
  latmin = subset2.iloc[i]['latmin']
  latmax = subset2.iloc[i]['latmax']
  lonmin = subset2.iloc[i]['lonmin']
  lonmax = subset2.iloc[i]['lonmax']
  link = (
   'https://www.opencellid.org/cell/getInArea?key={}&BBOX={},{},{},{}&'
   'mcc={}&format={}'.format(api_key, latmin, lonmin, latmax, lonmax, mcc, output_format))
  oci_request = requests.get(link)
  file_name = 'subset2.csv'
  with open(file_name, 'ab') as file:
    headers = oci_request.content.decode().splitlines()[0] # Extract the first line as headers
    file.write(f"{headers}\n".encode()) # Write headers to file
    data = oci_request.content.decode().splitlines()[1:] # Extract all lines after the headers
    for i in data: # Loop through each line of data
      file.write(f"{i}\n".encode()) # Write each line to the file, avoiding overwriting earlier lines
# Subset 3: Loop through cells to pull OpenCellID data
for i in range(len(subset3)):
  if (i + 1) % 1000 == 0:
    time.sleep(120)
  api_key = 'pk.763aaba1a5d1ced55fac8eb302d7828b'
  mcc = 645
  output_format = 'csv'
  latmin = subset3.iloc[i]['latmin']
  latmax = subset3.iloc[i]['latmax']
  lonmin = subset3.iloc[i]['lonmin']
  lonmax = subset3.iloc[i]['lonmax']
  link = (
   'https://www.opencellid.org/cell/getInArea?key={}&BBOX={},{},{},{}&'
   'mcc={}&format={}'.format(api_key, latmin, lonmin, latmax, lonmax, mcc, output_format))
  oci_request = requests.get(link)
  file_name = 'subset3.csv'
  with open(file_name, 'ab') as file:
    headers = oci_request.content.decode().splitlines()[0] # Extract the first line as headers
    file.write(f"{headers}\n".encode()) # Write headers to file
    data = oci_request.content.decode().splitlines()[1:] # Extract all lines after the headers
    for i in data: # Loop through each line of data
      file.write(f"{i}\n".encode()) # Write each line to the file, avoiding overwriting earlier lines
# Subset 4: Loop through cells to pull OpenCellID data
for i in range(len(subset4)):
  if (i + 1) % 1000 == 0:
    time.sleep(120)
  api_key = 'pk.763aaba1a5d1ced55fac8eb302d7828b'
  mcc = 645
  output_format = 'csv'
  latmin = subset4.iloc[i]['latmin']
  latmax = subset4.iloc[i]['latmax']
  lonmin = subset4.iloc[i]['lonmin']
  lonmax = subset4.iloc[i]['lonmax']
  link = (
   'https://www.opencellid.org/cell/getInArea?key={}&BBOX={},{},{},{}&'
   'mcc={}&format={}'.format(api_key, latmin, lonmin, latmax, lonmax, mcc, output_format))
  oci_request = requests.get(link)
  file_name = 'subset4.csv'
  with open(file_name, 'ab') as file:
    headers = oci_request.content.decode().splitlines()[0] # Extract the first line as headers
    file.write(f"{headers}\n".encode()) # Write headers to file
    data = oci_request.content.decode().splitlines()[1:] # Extract all lines after the headers
    for i in data: # Loop through each line of data
      file.write(f"{i}\n".encode()) # Write each line to the file, avoiding overwriting earlier lines

Step 4: Filter output for cell tower data.

The raw output from OpenCellID’s API includes repeat headers and many rows indicating that no cell towers were present in the specified coordinates. To make our later clustering analysis more efficient, we remove all rows except for those containing cell tower data.

# Generate list of files
file_list = glob.glob("C:/Users/mm-br/OneDrive - nd.edu/Documents/Mod 3/3 - Unstructured Data Analysis/Final Project/OpenCellID/*")

# Iterate through list of files
OpenCellID = []

for i in file_list:
  column_names = ['lat', 'lon', 'mcc', 'mnc', 'lac', 'cellid', 'averageSignalStrength',
       'range', 'samples', 'changeable', 'radio', 'rnc', 'cid', 'tac', 'sid',
       'nid', 'bid']
  temp = pd.read_csv(i, names = column_names, header=None)
  filename = i.split('OpenCellID\\')[1]
  OpenCellID.append(temp)

# Concatenate the separate csvs in the OpenCellID list
OpenCellID = pd.concat(OpenCellID)

# View
print(OpenCellID.head())
print(len(OpenCellID))

# Filter for rows with data
OpenCellID = OpenCellID[OpenCellID['mcc'] == '645']

# View
print(OpenCellID.head())
print(len(OpenCellID))

# Save
OpenCellID.to_csv('C:/Users/mm-br/OneDrive - nd.edu/Documents/Mod 3/3 - Unstructured Data Analysis/Final Project/OpenCellID.csv', index = False)
                lat              lon  mcc  mnc  lac  cellid  \
0               lat              lon  mcc  mnc  lac  cellid   
1  -14.976425170898  30.229568481445  645    2  105     981   
2              info             code  NaN  NaN  NaN     NaN   
3    No cells found                1  NaN  NaN  NaN     NaN   
4               lat              lon  mcc  mnc  lac  cellid   

   averageSignalStrength  range  samples  changeable  radio  rnc  cid  tac  \
0  averageSignalStrength  range  samples  changeable  radio  rnc  cid  tac   
1                    NaN   1000        1           1    GSM  NaN  NaN  NaN   
2                    NaN    NaN      NaN         NaN    NaN  NaN  NaN  NaN   
3                    NaN    NaN      NaN         NaN    NaN  NaN  NaN  NaN   
4  averageSignalStrength  range  samples  changeable  radio  rnc  cid  tac   

   sid  nid  bid  
0  sid  nid  bid  
1  NaN  NaN  NaN  
2  NaN  NaN  NaN  
3  NaN  NaN  NaN  
4  sid  nid  bid  
37625
                lat              lon  mcc mnc    lac cellid  \
1  -14.976425170898  30.229568481445  645   2    105    981   
5  -14.965438842773  30.235061645508  645   1  11106  20191   
6        -14.964527        30.239644  645   2  11106  20191   
7  -14.962692260742  30.236434936523  645   1  11105  20391   
9        -14.964242        30.240122  645   2  11117  20393   

  averageSignalStrength  range samples changeable radio  rnc  cid  tac  sid  \
1                   NaN   1000       1          1   GSM  NaN  NaN  NaN  NaN   
5                   NaN   1000       1          1   GSM  NaN  NaN  NaN  NaN   
6                   NaN  10629       1          1   GSM  NaN  NaN  NaN  NaN   
7                   NaN   1000       1          1   GSM  NaN  NaN  NaN  NaN   
9                   NaN   9346       1          1   GSM  NaN  NaN  NaN  NaN   

   nid  bid  
1  NaN  NaN  
5  NaN  NaN  
6  NaN  NaN  
7  NaN  NaN  
9  NaN  NaN  
1490

Cell Towers (red) in Eastern Province

Clustering

Having now gathered all required data, we begin the process of selection by clustering health facilities purely based on their geographic location. We use K-Means clustering for this process because it is fairly simple and the resulting clusters must be spherical (or circular in 2D)8, 9.

Part 5: K-Means Clustering

Step 1: Run K-Means with cluster radius limit.

Zipline drones can travel up to 80 km from a hub to make deliveries. To ensure that cluster radii do not exceed this limit, we run K-Means with a custom function to calculate and limit cluster radii based on the great circle distance of each facility from its cluster centroid. Great circle distance is used (as opposed to Euclidian distance, which is suitable for 2D planes) to account for the Earth’s curvature.

from sklearn.cluster import KMeans
from geopy.distance import great_circle

# Build function to calculate and limit cluster radius
def radius_constraint(facilities, labels, centroids, max_radius_km = 80):
  for i, centroid in enumerate(centroids):
    cluster_points = facilities[labels == i]
    centroid_coords = (centroid[0], centroid[1])

    radius = max(great_circle((lat, lon), centroid_coords).km for lat, lon in cluster_points)

    if radius > max_radius_km:
      return False
  return True

# Build function to find required no. of clusters (using radius_constraint function)
def find_clusters(MFL_VF, max_radius_km = 80):
  facilities = MFL_VF[['Latitude', 'Longitude']].values
  k = 1

  while True:
    kmeans = KMeans(n_clusters = k, random_state = 1, n_init = 'auto')
    labels = kmeans.fit_predict(facilities)
    centroids = kmeans.cluster_centers_

    if radius_constraint(facilities, labels, centroids, max_radius_km):
      break
    k = k + 1
  
  MFL_VF['Cluster'] = labels
  return MFL_VF, centroids

# Run clustering process
MFL_VF, centroids = find_clusters(MFL_VF, max_radius_km = 80)

# Convert centroids to DataFrame
clusters = pd.DataFrame(centroids, columns = ['Latitude', 'Longitude'])

# Add cluster labels
clusters['Cluster_No'] = range(0, len(centroids))

# Save
MFL_VF.to_csv('C:/Users/mm-br/OneDrive - nd.edu/Documents/Mod 3/3 - Unstructured Data Analysis/Final Project/MFL_VF_Clustered.csv', index = False)

clusters.to_csv('C:/Users/mm-br/OneDrive - nd.edu/Documents/Mod 3/3 - Unstructured Data Analysis/Final Project/clusters.csv', index = False)

Health Facilities Clustered by K-Means with 80 km Radii

Step 2: Basic aggregations by cluster.

To compare clusters, we perform simple aggreations on facility count, catchment population, distance, and duration. Note that summation cannot be performed on distance and duration due to sparsity of data.

# Load MFL_VF_Clustered
MFL_VF_Clustered = pd.read_csv('C:/Users/mm-br/OneDrive - nd.edu/Documents/Mod 3/3 - Unstructured Data Analysis/Final Project/MFL_VF_Clustered.csv')

# View
print(MFL_VF_Clustered.head())
                            Name Province    District           Type  \
0          Mankhaka  Health Post  Eastern     Lundazi    Health Post   
1       Kawaza Rural Health Post  Eastern      Katete    Health Post   
2  Bwanunkha Rural Health Centre  Eastern     Chadiza  Health Centre   
3   Chinambi Rural Health Centre  Eastern      Nyimba  Health Centre   
4      Mnoro Rural Health Centre  Eastern  Chipangali  Health Centre   

  Ownership Ownership type  Catchment population cso   Latitude  Longitude  \
0       MOH         Public                    8153.0 -12.288883  33.174417   
1       MOH         Public                    3524.0 -14.256272  32.164201   
2       MOH         Public                    5156.0 -13.934065  32.576443   
3       MOH         Public                    4924.0 -14.468860  30.670398   
4       MOH         Public                    9672.0 -13.525668  32.646855   

   Geocoded_Lon  Geocoded_Lat  Distance_KM  Duration_HR  Cluster  
0     33.177151    -12.288183     183.4942     3.030917        4  
1     31.946058    -14.093715      96.7518     1.184139        5  
2     32.575007    -13.932266      38.7530     1.233583        1  
3     30.581171    -14.379737          NaN          NaN        6  
4     32.646664    -13.525717      17.7136     0.319778        1  
# Basic aggregations
cluster_comparison = MFL_VF_Clustered.groupby('Cluster').agg({'Name': 'count', 'Catchment population cso': 'sum', 'Distance_KM': 'mean', 'Duration_HR': 'mean'})

# Rename columns
cluster_comparison = cluster_comparison.rename(columns = {'Name': 'Facilities_Count', 'Catchment population cso': 'Population_Sum', 'Distance_KM': 'Distance_KM_Mean', 'Duration_HR': 'Duration_HR_Mean'})

# Rounding for readability
cluster_comparison = cluster_comparison.round(2)

# View
print(cluster_comparison)

# Save
cluster_comparison.to_csv('C:/Users/mm-br/OneDrive - nd.edu/Documents/Mod 3/3 - Unstructured Data Analysis/Final Project/cluster_comparison.csv')
         Facilities_Count  Population_Sum  Distance_KM_Mean  Duration_HR_Mean
Cluster                                                                      
0                      71        455122.0            161.72              2.32
1                      84        276310.0             32.13              0.55
2                       9         46580.0            209.83              5.65
3                      31        255856.0            101.06              1.80
4                      41        307106.0            169.61              2.97
5                      68        536551.0             92.73              1.31
6                      39        180994.0            202.20              2.85

Step 3: Add network coverage by cluster.

With our clusters now established, we can calculate the percentage of the cluster area covered by 2G, 3G, and 4G networks. For simplicity, I performed this operation in ArcGIS and then exported the results as CSVs.

The process in ArcGIS involved: building buffer polygons over the cell tower points with the radii equivalent to their range from OpenCellID, then overlaying those polygons with the cluster polygons and performing union and intersect operations. I then calculated the area of the resulting polygons (which had been subset to only include the cluster area and avoid double-counting in cases of tower overlap) and divided by the total cluster area to get the percentage of coverage.

# Load cluster_comparison
cluster_comparison = pd.read_csv('C:/Users/mm-br/OneDrive - nd.edu/Documents/Mod 3/3 - Unstructured Data Analysis/Final Project/cluster_comparison.csv')
# Load OpenCellID_Cluster_Coverage
OpenCellID_Cluster_Coverage = pd.read_csv('C:/Users/mm-br/OneDrive - nd.edu/Documents/Mod 3/3 - Unstructured Data Analysis/Final Project/OpenCellID_Cluster_Coverage.csv')

# Combine with cluster_comparison
cluster_comparison['Overlap_Perc_2G'] = OpenCellID_Cluster_Coverage['Overlap_Perc_2G']
cluster_comparison['Overlap_Perc_3G4G'] = OpenCellID_Cluster_Coverage['Overlap_Perc_3G4G']

# View
print(cluster_comparison)
   Cluster  Facilities_Count  Population_Sum  Distance_KM_Mean  \
0        0                71        455122.0            161.72   
1        1                84        276310.0             32.13   
2        2                 9         46580.0            209.83   
3        3                31        255856.0            101.06   
4        4                41        307106.0            169.61   
5        5                68        536551.0             92.73   
6        6                39        180994.0            202.20   

   Duration_HR_Mean  Overlap_Perc_2G  Overlap_Perc_3G4G  
0              2.32         0.999623           0.001995  
1              0.55         0.805140           0.002692  
2              5.65         0.200214           0.000156  
3              1.80         0.783264           0.003772  
4              2.97         0.290631           0.001096  
5              1.31         0.909158           0.003338  
6              2.85         0.967982           0.001227  

Optimization

Having established our clusters and calculated all necessary decision criteria, we now move on to optimization. To choose the best cluster, we consider: 1) total number of facilities, 2) total catchment population, 3) average road distance to Chipata resupply hub, 4) average travel time to Chipata resupply hub, and 5) 2G network coverage. Note that 3G and 4G network coverage was less than 1% in all clusters and was therefore not considered significant.

We perform optimization using the PyMCDM library, which includes the Technique for the Order of Prioritisation by Similarity to Ideal Solution (TOPSIS) method for multi-criteria decision making10. The TOPSIS method takes three arguments: a matrix of alternatives, an array of criteria weights, and an array of criteria types (maximization or minimization). The method returns a score for each alternative and a ranking based on those scores.

Part 6: Initial Narrowing

We perform an intial round of optimization to select the three best clusters for further consideration. The decision weights are established as follows: 0.3 for facilities count, 0.3 for catchment population, 0.1 for distance, 0.1 for duration, and 0.2 for 2G network coverage. These weights are based purely on my own judgement and could be adjusted based on real business requirements. I give priority to facilities count and catchment population as measures of potential impact, followed by 2G network coverage as a measure of operational feasibility. Distance and duration are given less weight for reasons outlined in Part 3.

After identifying our top three clusters, we isolate them in ArcGIS for further analysis.

decision_matrix = np.array(cluster_comparison.iloc[:, 2:7])

decision_weights = np.array([0.3, 0.3, 0.1, 0.1, 0.2])

decision_types = np.array([1, 1, 1, 1, 1])

topsis = mcdm.TOPSIS()
scores = topsis(decision_matrix, decision_weights, types = decision_types)
ranking = np.argsort(scores)[::-1]+1

print("Scores: ", scores)
print("Ranking: ", ranking)
Scores:  [0.70548494 0.37130609 0.45803989 0.51063995 0.54916232 0.63269352
 0.54857545]
Ranking:  [1 6 5 7 4 3 2]

Semifinalist Clusters

Part 7: Final Selection

Having found our top three clusters, we now begin the process of final selection, which involves re-running the optimization model with updated criteria on each cluster.

Step 1: Re-classify health facilities within top three clusters (ArcGIS).

As we imagine the implementation of a Zipline hub at any of our clusters, we must consider that the hub would serve not only those facilities initially assigned to it by the K-Means clustering, but any facilities falling within its 80 km radius. To account for this, we reclassify facilities into clusters (allowing assignment of facilities to multiple clusters) based on their intersection with the 80 km buffer around each cluster centroid. This operation was performed mostly in ArcGIS for simplicity and then completed here after export to CSVs.

After identifying all servicable facilities for each cluster, we re-run our aggregations to prepare for optmization.

# Load
Cluster0Reselect = pd.read_csv('C:/Users/mm-br/OneDrive - nd.edu/Documents/Mod 3/3 - Unstructured Data Analysis/Final Project/Cluster0Reselect.csv')

Cluster4Reselect = pd.read_csv('C:/Users/mm-br/OneDrive - nd.edu/Documents/Mod 3/3 - Unstructured Data Analysis/Final Project/Cluster4Reselect.csv')

Cluster5Reselect = pd.read_csv('C:/Users/mm-br/OneDrive - nd.edu/Documents/Mod 3/3 - Unstructured Data Analysis/Final Project/Cluster5Reselect.csv')

# Update cluster number consistent with reclassification
Cluster0Reselect['Cluster'] = 0
Cluster4Reselect['Cluster'] = 4
Cluster5Reselect['Cluster'] = 5

# Combine
Clusters_Semifinal = pd.concat([Cluster0Reselect, Cluster4Reselect, Cluster5Reselect])

# Basic aggregations
Clusters_Semifinal_Comparison = Clusters_Semifinal.groupby('Cluster').agg({'Name': 'count', 'Catchment_population_cso': 'sum', 'Distance_KM': 'mean', 'Duration_HR': 'mean'})

# Rename columns
Clusters_Semifinal_Comparison = Clusters_Semifinal_Comparison.rename(columns = {'Name': 'Facilities_Count', 'Catchment_population_cso': 'Population_Sum', 'Distance_KM': 'Distance_KM_Mean', 'Duration_HR': 'Duration_HR_Mean'})

# Rounding for readability
Clusters_Semifinal_Comparison = Clusters_Semifinal_Comparison.round(2)

# Save
Clusters_Semifinal_Comparison.to_csv('C:/Users/mm-br/OneDrive - nd.edu/Documents/Mod 3/3 - Unstructured Data Analysis/Final Project/Clusters_SemiFinal_Comparison.csv')

# View
print(Clusters_Semifinal_Comparison)
         Facilities_Count  Population_Sum  Distance_KM_Mean  Duration_HR_Mean
Cluster                                                                      
0                     137        806686.0            146.49              2.08
4                      47        341480.0            164.04              2.87
5                     181        958262.0             92.08              1.34

Health Facilities Reclassified by Cluster (Cluster 0)

Health Facilities Reclassified by Cluster (Cluster 4)

Health Facilities Reclassified by Cluster (Cluster 5)

Step 2: Add network coverage by cluster.

# Load Clusters_Semifinal_Comparison
Clusters_Semifinal_Comparison = pd.read_csv('C:/Users/mm-br/OneDrive - nd.edu/Documents/Mod 3/3 - Unstructured Data Analysis/Final Project/Clusters_Semifinal_Comparison.csv')
# Load OpenCellID_Cluster_Coverage
OpenCellID_Cluster_Coverage = pd.read_csv('C:/Users/mm-br/OneDrive - nd.edu/Documents/Mod 3/3 - Unstructured Data Analysis/Final Project/OpenCellID_Cluster_Coverage.csv')

# Combine with Clusters_Semifinal_Comparison
Clusters_Semifinal_Comparison['Overlap_Perc_2G'] = OpenCellID_Cluster_Coverage['Overlap_Perc_2G'].iloc[[0, 4, 5]].reset_index(drop = True)

Clusters_Semifinal_Comparison['Overlap_Perc_3G4G'] = OpenCellID_Cluster_Coverage['Overlap_Perc_3G4G'].iloc[[0, 4, 5]].reset_index(drop = True)

# View
print(Clusters_Semifinal_Comparison)
   Cluster  Facilities_Count  Population_Sum  Distance_KM_Mean  \
0        0               137        806686.0            146.49   
1        4                47        341480.0            164.04   
2        5               181        958262.0             92.08   

   Duration_HR_Mean  Overlap_Perc_2G  Overlap_Perc_3G4G  
0              2.08         0.999623           0.001995  
1              2.87         0.290631           0.001096  
2              1.34         0.909158           0.003338  

Step 3: Final selection.

We now complete the final selection process by running the optimization model on our three semifinalist clusters. The decision weights remain the same as in the initial narrowing process.

decision_matrix = np.array(Clusters_Semifinal_Comparison.iloc[:, 2:7])

decision_weights = np.array([0.3, 0.3, 0.1, 0.1, 0.2])

decision_types = np.array([1, 1, 1, 1, 1])

topsis = mcdm.TOPSIS()
scores = topsis(decision_matrix, decision_weights, types = decision_types)
ranking = np.argsort(scores)[::-1]+1

print("Scores: ", scores)
print("Ranking: ", ranking)
Scores:  [0.67635659 0.45803989 0.53962091]
Ranking:  [1 3 2]

Best Cluster (Cluster 0)

Results

The optimization model recommends Cluster 0 based on our criteria. This appears logical from looking at the map - Cluster 0 covers quite a few health facilities and therefore could serve a large catchment population. It is located far enough from Chipata (located at the eastern edge of the province about halfway up) to benefit from drone delivery over road transport, but not so far as to suffer from low network coverage.

Should Zipline place a hub at the centroid of this cluster, the impact would be significant: a population of over 806k and 137 facilities served, with excellent operational feasibility with 99.96% 2G network coverage.

Discussion

This exercise relied heavily on external data sources, and is therefore subject to limitations related to their quality. We saw, for example, that the downloaded Master Facility List for Eastern Province from the Zambia Ministry of Health included facilities whose coordinates fell outside the province’s boundaries. It is possible that those facilities do belong in Eastern Province, and their coordinate were simply inaccurate, in which case our calculations of total facilities served may not be perfectly accurate. Similarly, routing calls to the OpenRouteService API failed roughly 50% of the time due to “unroutable coordinates” despite georeferencing. Our estimations of distance and duration are therefore based on only partial information.

It may also be worth exploring alternative clustering methods, particularly those that handle classification to more than one cluster (non-partitional methods).

Footnotes

  1. Zipline. (n.d.). Zipline. https://www.flyzipline.com/↩︎

  2. United Nations Development Program. (n.d.). Human Development Insights. https://hdr.undp.org/data-center/country-insights#/ranks↩︎

  3. City Population. (2024). Zambia: Administrative Division. https://www.citypopulation.de/en/zambia/admin/↩︎

  4. Country Reports. (n.d.). Traffic and Road Conditions in Zambia. https://www.countryreports.org/country/Zambia/traffic.htm↩︎

  5. ZAMMSA. (n.d.). ZAMMSA. https://zammsa.co.zm/↩︎

  6. USAID. (2023). The Next Step in Planning Efficient Distribution of Health Commodities. https://ghsupplychain.org/sites/default/files/2023-11/00191_GHSCS_DispatchOptimizationTool_TechBrief_1.pdf↩︎

  7. Spectrum. (2019). In the Air with Zipline’s Medical Delivery Drones. https://spectrum.ieee.org/in-the-air-with-ziplines-medical-delivery-drones↩︎

  8. Real Python. (n.d.). K-Means Clustering in Python: A Practical Guide. https://realpython.com/k-means-clustering-python/↩︎

  9. Scikit-learn. (n.d.). KMeans. https://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html↩︎

  10. PyPi.org. (n.d.). PyMCDM. https://pypi.org/project/pymcdm/?utm_source=chatgpt.com#c1↩︎